Abstract

In this paper we study events with a W+jets final state, produced in double parton (DP) interactions, as a background
to the associated Higgs boson (H) and W production, with $H \to b\bar{b}$ decay, at the Tevatron. We have found that the
event yield from the DP background can be quite sizable, which necessitates a choice of selection criteria to separate
the HW and DP production processes. We suggest a set of variables sensitive to the kinematics of DP and HW
events. We show that these variables, when used as input to an artificial neural network, allow one to significantly
improve the sensitivity to Higgs boson production.
I. INTRODUCTION

A significant amount of experimental data, ranging from ISR energies [1] through the SPS [2] to the Tevatron
[3-8], and even to photoproduction at HERA [9, 10], shows clear evidence of hard jets produced in multiple parton
interactions (MPI). Specifically, in the Tevatron Run I and Run II studies, 4-jet [3] and $\gamma$ + 3-jet events [4, 7] have
been considered with jet $p_T > 5-15$ GeV, and the fraction of events occurring due to double parton (DP) interactions
has been measured. Those fractions varied depending on the final state and the jet transverse momentum ($p_T$) of
the second parton interaction. The fraction measured using the 4-jet final state is found to be 5.5% for jet $p_T > 25$ GeV
[3]. The fractions obtained from the $\gamma$ + 3-jet production range from 51.3% for the second (ordered in $p_T$) and third
jet $p_T$ in the interval 5-7 GeV$^1$ [4] to 47%-22% for the second jet $p_T$ within 15-30 GeV [7].

Those experiments have also measured the effective cross section $\sigma_{\rm eff}$, an important parameter that contains
information about the parton spatial density inside the (anti)proton: $\sigma_{\rm eff} = 12.1^{+10.7}_{-5.4}$ mb in the 4-jet production in
CDF [3], and $\sigma_{\rm eff} = 14.5 \pm 1.7^{+1.7}_{-2.3}$ mb and $\sigma_{\rm eff} = 16.4 \pm 0.3 \pm 2.3$ mb in the $\gamma$ + 3-jet productions in CDF [4] and D0 [7],
respectively. This parameter allows the calculation of a DP cross section $\sigma_{\rm DP}$ for any pair of partonic processes A
and B according to:

$$\sigma_{\rm DP} = m\,\frac{\sigma_A\,\sigma_B}{\sigma_{\rm eff}}. \qquad (1)$$

The factor $m$ has a Poissonian nature [11]: it equals 1/2 for two indistinguishable processes (like two
dijet productions in A and B) and unity for distinguishable processes. The CDF [4] and D0 [7] experiments
obtained the most accurate results on $\sigma_{\rm eff}$, with an average value of about $\sigma_{\rm eff}^{\rm ave} = 15.5$ mb.
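As a rough illustration of eq. (1), the sketch below evaluates the DP cross section for a pair of processes. The function name and the two input cross sections are hypothetical placeholders chosen only to show the arithmetic; only the default $\sigma_{\rm eff} = 15.5$ mb and the values of $m$ come from the text.

```python
def dp_cross_section(sigma_a_mb, sigma_b_mb, sigma_eff_mb=15.5, distinguishable=True):
    """Evaluate eq. (1): sigma_DP = m * sigma_A * sigma_B / sigma_eff (all in mb)."""
    if sigma_eff_mb <= 0:
        raise ValueError("sigma_eff must be positive")
    # m = 1 for distinguishable processes, 1/2 for indistinguishable ones
    m = 1.0 if distinguishable else 0.5
    return m * sigma_a_mb * sigma_b_mb / sigma_eff_mb


# Hypothetical inputs, for illustration only (not values from the paper)
sigma_w_mb = 2.0e-6       # placeholder for a W production cross section
sigma_dijet_mb = 1.0e-2   # placeholder for a dijet cross section above some pT cut
print(dp_cross_section(sigma_w_mb, sigma_dijet_mb))
```

Because $\sigma_{\rm eff}$ enters in the denominator, a smaller effective cross section gives a proportionally larger DP rate.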

In addition to information about parton spatial structure, those studies also pointed out that the DP interactions 
can be a noticeable background to many rare processes, especially for those with multijet final state. In this case an 
additional partonic interaction, most likely producing a dijet final state, can mimic the multijet signal signature. Some
estimates of the DP background to the Higgs boson production processes at the LHC have been done in [12-15]. 

In this paper we consider the DP events, caused by the W+dijet production, as a background to the HW production,
with $W \to \ell\nu$ and $H \to b\bar{b}$ decays, which is one of the most promising Higgs boson search channels at the Tevatron.
An example of a possible DP process with $W + b\bar{b}$ production is shown in figure 1. However, in addition to the
two-$b$-jet final state produced in the second parton scattering, we also expect a significant contribution from final states
with light+heavy flavor and two light jets.




Figure 1: A possible diagram for $W + b\bar{b}$ production due to DP scattering.

Due to the similarity of HW and HZ final states, we expect that the relative DP background from Z+dijet 
production to the HZ events should be quite close to the HW case. For this reason, we limit our study to DP 
background to the HW events only. 

This paper is organized as follows. In section II we describe how DP and Higgs boson samples are simulated and
selected. In section III we calculate differential cross sections $d\sigma/dM_{jj}$ (where $M_{jj}$ is the invariant mass of the two
leading jets) and event yields in the HW and DP processes including the jet energy detector smearing and $b$-jet
identification effects. The rates of events with W+2-jet production due to the DP and conventional single parton
(SP) scatterings are compared in section IV. In section V we introduce a set of variables sensitive to the kinematics of
the signal HW(Z) and DP background final states and use them as input to a dedicated Artificial Neural Network
(ANN) to separate the two event types. We give our conclusions in section VI.



1 In this measurement the jet $p_T$ is raw, i.e. uncorrected for energy losses [4].



da/dMjj cross sections for HW, HZ and double parton events 
II. SIMULATION AND SELECTIONS

A. Selections



The current PYTHIA event generator [16] is the best framework to study many effects related to MPI production.
It includes a few sophisticated phenomenological models which consider the MPI scatterings with their various
correlations, including parton momentum and color. The MPI models in PYTHIA 6 have been tuned to experimental
results and reproduce many observables in data quite well [11, 17]. PYTHIA 8, which inherited the majority of features
of its predecessor, also allows the combination of different kinds of parton processes in the first (main) and second
scatterings within kinematic regions of interest. To simulate events for this study we used PYTHIA 8 with Tune 2C as
the MPI model$^2$. The HW production channel was simulated with Higgs boson masses of $m_H = 115$ and 150 GeV.
The DP scattering was simulated as inclusive $q\bar{q} \to W + X$ production in the first parton process and
inclusive QCD dijet production in the second process. To increase statistics in the final states selected with the cuts
of this section, the W scattering process is required to have invariant mass $50 < m_W < 120$ GeV and the minimal allowed
parton transverse momentum ($p_T^{\rm min}$) in the dijet process is required to be $p_T^{\rm min} = 10$ GeV.

The event selection criteria are taken from [18] and applied to both the HW and DP production events. They are briefly
summarized below:

• The Higgs boson is required to decay into $b\bar{b}$.

• The W-boson is selected in the electron and muon decay modes with lepton $p_T > 15$ GeV and pseudorapidity
$|\eta| < 1.1$ or $1.5 < |\eta| < 2.5$ for electrons and $|\eta| < 1.6$ for muons.

• The total vector sum $p_T$ of neutrinos should be > 20 GeV (an approximate analog of missing $E_T > 20$ GeV in [18]).

• At least two jets are required with $p_T > 20$ GeV and $|\eta| < 2.5$. Jets are found by the D0 Run II midpoint cone
algorithm with radius $R = 0.5$ [19]. For this purpose we used the FastJet package [20] interfaced to PYTHIA 8.

• The scalar sum of the jet transverse momenta ($H_T$) is required to be $H_T > 60$ GeV for the 2-jet final state and
$H_T > 80$ GeV for the 3-jet one.
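The kinematic cuts listed above can be sketched as a simple event filter. This is only an illustration, not the code used in the study: the `Lepton` and `Jet` containers and the function names are hypothetical, $H_T$ is assumed to be summed over the jets passing the jet cuts, and events with more than three selected jets are assumed to use the 3-jet $H_T$ threshold (the text specifies only the 2- and 3-jet cases).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lepton:
    flavor: str   # "e" or "mu"
    pt: float     # GeV
    eta: float


@dataclass
class Jet:
    pt: float     # GeV
    eta: float


def lepton_ok(lep: Lepton) -> bool:
    """Lepton pT > 15 GeV and the flavor-dependent |eta| acceptance."""
    if lep.pt <= 15.0:
        return False
    abs_eta = abs(lep.eta)
    if lep.flavor == "e":
        return abs_eta < 1.1 or 1.5 < abs_eta < 2.5
    if lep.flavor == "mu":
        return abs_eta < 1.6
    return False


def passes_selection(lep: Lepton, nu_sum_pt: float, jets: List[Jet]) -> bool:
    """Apply the lepton, neutrino pT, jet and H_T cuts."""
    if not lepton_ok(lep) or nu_sum_pt <= 20.0:
        return False
    good = [j for j in jets if j.pt > 20.0 and abs(j.eta) < 2.5]
    if len(good) < 2:
        return False
    ht = sum(j.pt for j in good)  # assumption: H_T over the selected jets
    # assumption: the 80 GeV threshold is applied to 3 or more jets
    return ht > (60.0 if len(good) == 2 else 80.0)


event_ok = passes_selection(Lepton("e", 30.0, 0.5), 25.0, [Jet(40.0, 0.1), Jet(30.0, -1.0)])
print(event_ok)
```

The $H \to b\bar{b}$ decay requirement is a generator-level setting and is therefore not part of this filter.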



B. Normalizations

The cross sections of the simulated events were normalized either to experimentally measured cross sections or to
theoretical NNLO predictions. Specifically, we normalized all the PYTHIA cross sections in the following way:

• We simulated dijet production and calculated cross sections in the dijet mass bins 150-175 and 175-200
GeV and in the two rapidity regions $|y| < 0.4$ and $0.4 < |y| < 0.8$ available from the recent D0 measurement [21].
We found that the required PYTHIA-to-data correction factor ("K-factor") is about 1.26, approximately valid for
both dijet mass bins and both rapidity regions.

• We also simulated inclusive W production separately and, from a comparison of its cross section with the D0 and
CDF measurements [22, 23], obtained a PYTHIA-to-data K-factor of about 1.5.

• The HW cross section has been normalized to the NNLO predictions [24] with the PYTHIA-to-NNLO K-factor equal
to 1.45.

• We corrected the effective cross section $\sigma_{\rm eff}$ used in Tune 2C$^3$ by a factor of 1.6 to match the CDF and D0 measurements
[4, 7], with the averaged result $\sigma_{\rm eff}^{\rm ave} = 15.5$ mb.

The uncertainties assigned in our analysis are 10% for the K-factors and 16% for $\sigma_{\rm eff}^{\rm ave}$. The latter is due to the
difference between the D0 and CDF $\sigma_{\rm eff}$ central values (~ 7%) and the systematic uncertainties (~ 14%) in the D0
measurement.

2 This tune was suggested by the PYTHIA authors.

3 The effective cross section $\sigma_{\rm eff}$ in PYTHIA 8 is taken as the ratio of the total non-diffractive cross section to an impact-parameter enhancement
factor, which depends on the parton spatial density distribution.

III. $d\sigma/dM_{jj}$ CROSS SECTIONS FOR HW AND DOUBLE PARTON EVENTS

A. HW and DP cross sections

In this section we calculate the differential cross sections $d\sigma/dM_{jj}$ for the HW and DP (W+dijet) events selected
according to the criteria of section II. To match the detector resolution, the jet transverse momenta $p_T$ are smeared
using

$$\frac{\sigma_{p_T}}{p_T} = \frac{S}{\sqrt{p_T}} \oplus C, \qquad (2)$$

where $S = 0.75$ and $C = 0.06$, which approximately reproduces the jet $p_T$ resolution of the D0 detector [25]. The
differential cross sections $d\sigma/dM_{jj}$ for the HW and DP productions including the smearing effect are shown in figure
2. In addition to the total DP cross section, contributions from the main DP scattering subprocesses are also shown
in a separate plot. One can see from these two plots that (a) the DP cross section dominates the HW signal by more
than two orders of magnitude, and (b) the DP cross section is caused mainly by the W+2 light jets (stemming from
u/d/s-quarks or gluons) production, followed, in order of importance, by contributions from $W + gc$, $W + gb$, and
then by $W + c\bar{c}$ and $W + b\bar{b}$ events.
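A minimal sketch of the jet $p_T$ smearing of eq. (2) is given below. It assumes that the stochastic and constant terms are combined in quadrature, that $p_T$ is in GeV, and that the smearing is Gaussian; the function names and the clamping of negative values to zero are choices made for this illustration only.

```python
import math
import random

S = 0.75  # stochastic term of eq. (2)
C = 0.06  # constant term of eq. (2)


def relative_pt_resolution(pt):
    """sigma(pT)/pT, with the two terms added in quadrature (assumption)."""
    if pt <= 0:
        raise ValueError("pT must be positive")
    return math.hypot(S / math.sqrt(pt), C)


def smear_pt(pt, rng):
    """Return a Gaussian-smeared pT (assumption), clamped at zero."""
    sigma = pt * relative_pt_resolution(pt)
    return max(0.0, rng.gauss(pt, sigma))


rng = random.Random(12345)  # fixed seed for reproducibility
for pt in (20.0, 50.0, 100.0):
    print(pt, round(relative_pt_resolution(pt), 4), round(smear_pt(pt, rng), 2))
```

With these parameters the relative resolution is about 16% at 25 GeV and falls below 10% at 100 GeV, so the dijet mass peak of the signal is broadened mainly by the softer jet.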







Figure 2: The differential cross sections in the dijet mass $M_{jj}$ bins for signal HW and background DP events including the
jet $p_T$ resolution. On the left plot, dotted and dash-dotted red lines correspond to HW events with $m_H = 115$ and 150
GeV, respectively, while the full black line shows the total background from all the DP W+dijet channels. The right plot shows
contributions from the main parton scattering subprocesses composing the total DP background.



B. Account of b-jet identification efficiencies



In the signal HW events we have two $b$ jets in the final state. Since the leading DP background is caused by the
W+2 light jet events (figure 2), we should expect a significant reduction after requiring jet $b$-tagging. To check this
numerically, we apply a specific $b$-tagging requirement to the HW and DP events. In our fast MC we cannot check
the jet $b$-tagging quality; instead we use the efficiencies to pass the $b$-tagging requirements for light ($l$), $c$ and $b$
jets. We take these efficiencies from [26], where they are parametrized as functions of jet $p_T$ and $\eta$. These efficiencies
are used to re-weight events according to the jet flavors. Typical efficiencies are 50-70% for $b$-jets, 8-12% for $c$-jets
and 0.5-2% for $l$-jets. The variations reflect the dependence on the jet $p_T$, $\eta$ and the tightness of the $b$-tagging condition. We
consider a given jet to be a $b$-jet if it has a $b$-quark in the jet cone; if the jet does not have a $b$-quark but has a $c$-quark
instead, it is considered to be a $c$-jet; otherwise it is a light jet. Figure 3 shows the cross sections $\times$ $b$-jet identification
efficiency ($\epsilon^{b\text{-jet}}_{\rm id}$) for the DP and HW events, where each of the two jets is required to satisfy the "loose" $b$-tagging
requirement [26]. This requirement significantly suppresses the rates of the DP events. However, the signal rates are also
noticeably reduced (compare figures 2 and 3). For this reason, in practice, double tagging is usually combined with
single tagging. For example, in the search for the HW signal [18], two cases of $b$-tagging are considered: either an
event should contain two jets satisfying the "loose" $b$-tagging requirements or, if it fails, a single jet should satisfy the
"tight" requirement. Fractions of background (= data) and HW events selected with single $b$-tagging can be
taken from [18]: they are about 85% and 60%, respectively$^4$. The remaining events have two $b$-tagged jets.
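The flavor assignment and re-weighting described above can be illustrated as follows. The constant efficiencies are mid-range placeholders picked from the quoted typical intervals; the study itself uses the $p_T$- and $\eta$-dependent parametrizations of [26], which are not reproduced here. All function and variable names are hypothetical.

```python
# Placeholder efficiencies (assumption): mid-range values of the quoted intervals.
# The actual analysis uses pT- and eta-dependent parametrizations from [26].
TAG_EFF = {"b": 0.60, "c": 0.10, "l": 0.01}


def jet_flavor(has_b_quark, has_c_quark):
    """b-jet if a b-quark is in the cone, else c-jet if a c-quark is, else light."""
    if has_b_quark:
        return "b"
    if has_c_quark:
        return "c"
    return "l"


def double_tag_weight(flavor1, flavor2, eff=TAG_EFF):
    """Event weight when both leading jets are required to be b-tagged."""
    return eff[flavor1] * eff[flavor2]


for pair in (("b", "b"), ("b", "l"), ("c", "c"), ("l", "l")):
    print(pair, double_tag_weight(*pair))
```

With these placeholder numbers a $b\bar{b}$ pair keeps about a third of its rate while a light-light pair is suppressed by roughly four orders of magnitude, which is why heavy-flavor DP channels gain relative importance after tagging.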



4 Clearly, here we assume that the jet flavor content of the background events in data and the dijet events from the DP interaction is the 
same. However, we believe that for the current level of estimates this assumption should be good enough. 






Figure 3: The differential cross sections in the dijet mass bins for signal HW and background DP events including the jet $p_T$
resolution and the requirement of $b$-tagging for both jets (see also the description in the caption to figure 2).

Figure 4 shows the cross sections $\times\,\epsilon^{b\text{-jet}}_{\rm id}$ for the DP and HW events, where we have combined events with single and
double $b$-tagging according to their fractions mentioned above. We see that while the dominant DP channel is still
caused by the W+2 light jet production, the relative contribution from $W + gb$ production is now much higher than
in figure 2 (where no $b$-tagging is applied). The $W + gb$ contribution is followed by similar ones from the $W + gc$ and $W + b\bar{b}$
events.

Figure 5 is complementary to figure 4 and shows the ratios of the HW event yield to the inclusive DP W+dijet one
in the dijet mass $M_{jj}$ bins for the events selected by the combined $b$-tagging. The uncertainty in each bin is caused
by the K-factors and the effective cross section (section II).

One can see that the Higgs boson events with $m_H = 115$ GeV are expected to be suppressed by about a factor of 3
(S/B ~ 0.35) at the peak position, while the signal events with $m_H = 150$ GeV are suppressed by about a factor of 7.

It is interesting to compare the total number of signal events predicted by our fast MC after all selections (figure
4) with that in [18] for the integrated luminosity $L_{\rm int} = 5.3$ fb$^{-1}$. This number is obtained by integrating the cross section
over the whole $M_{jj}$ range (20-400 GeV) and multiplying by $L_{\rm int}$. In this way we have found expected signal
statistics of about 31 (7) events for $m_H = 115$ (150) GeV. According to [18] there should be about 19 ± 1 selected
events for $m_H = 115$ GeV. Our estimate seems to be in reasonable agreement if we take into account the effects of
finite lepton identification, jet taggability efficiencies, and detector acceptance, which are not accounted for in our fast MC.
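The event-yield estimate described above amounts to summing a binned differential cross section over $M_{jj}$ and multiplying by the integrated luminosity. A minimal sketch is shown below; the bin contents are hypothetical placeholders and the function name is an assumption of this example, with cross sections in fb/GeV and luminosity in fb$^{-1}$.

```python
def expected_events(dsigma_dm_fb_per_gev, bin_edges_gev, lumi_fb_inv=5.3):
    """N = L_int * sum_i (dsigma/dM)_i * (bin width)_i."""
    if len(bin_edges_gev) != len(dsigma_dm_fb_per_gev) + 1:
        raise ValueError("need one more bin edge than bin contents")
    sigma_fb = sum(
        value * (hi - lo)
        for value, lo, hi in zip(dsigma_dm_fb_per_gev, bin_edges_gev[:-1], bin_edges_gev[1:])
    )
    return sigma_fb * lumi_fb_inv


# Hypothetical binned spectrum, for illustration only (not values from the paper)
edges = [20.0, 60.0, 100.0, 140.0, 400.0]
values = [0.01, 0.05, 0.06, 0.002]
print(round(expected_events(values, edges), 2))
```

Read in reverse, the quoted 31 events at 5.3 fb$^{-1}$ correspond to an integrated cross section times efficiency of roughly 5.8 fb.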
Figure 4: The differential cross sections in the dijet mass bins for signal HW and background DP events including the jet $p_T$
resolution and the combined jet $b$-tagging efficiency (see also the main text and the caption to figure 2).

Figure 5: The ratio of HW signal to DP background event yields with the combined $b$-tagging (see the main text).



IV. COMPARISON OF THE DP AND SP EVENT YIELDS 



In this section we compare the event yields $dN/dM_{jj}$ expected for the DP and SP W+2-jet productions. The two
additional jets in the SP events come from radiation effects in the initial and/or final states. SP events are simulated
using the $q\bar{q} \to Wg$ and $qg \to Wq$ subprocesses and applying the HW selection criteria from section II. To reproduce
the inclusive W+2 jet cross section in data [27], the PYTHIA events are reweighted with a scaling factor depending on
the second jet $p_T$, which increases the PYTHIA W+2 jet cross section in the region $110 < M_{jj} < 160$ GeV by about a
factor of 2. Also, as before, the jet $p_T$ is smeared according to the $p_T$ resolution, eq. (2), and the events are weighted
with the jet $b$-tagging efficiencies according to the jet flavors.

The estimated total event yields in the whole mass region at $L_{\rm int} = 5.3$ fb$^{-1}$ for SP and DP events are about 5212
and 262 events, respectively. The differential ratios of the DP/SP W+2-jet event yields in the $M_{jj}$ bins are shown in
figure 6. They are about 5% for $M_{jj} \simeq 115$ GeV and 3.5-6% for $M_H \simeq 150$ GeV.

Figure 6: The ratio of the DP to SP event yields for the W+2-jet production.



Artificial neural network for DP and HW(Z) separation 






V. ARTIFICIAL NEURAL NETWORK FOR DP AND HW(Z) SEPARATION 

A. Variables 



In this section we discuss variables that can be useful to separate the HW(Z) signal from the DP W(Z)+dijet
background events. Most of these variables are either based on the previous relevant experimental studies [1-7] or
have been suggested in theoretical papers [11, 28-34]. Due to the similarity of HW and HZ events, most of these
variables should be useful to suppress the DP background events in both final states (with some exceptions).
Definitions of all the variables are summarized below.

• The first variable is the azimuthal angle between two $p_T$ vectors: the first is the W(Z) $p_T$
vector, and the second is the sum of the leading and second jet $p_T$ vectors:

$$\Delta S = \Delta\phi\left(\vec{p}_T[V],\ \vec{p}_T[{\rm jet1}, {\rm jet2}]\right), \qquad (3)$$

where $\vec{p}_T[V]$ is the transverse momentum vector of the $V (= W, Z)$ boson and $\vec{p}_T[{\rm jet1}, {\rm jet2}] = \vec{p}_T^{\,\rm jet1} + \vec{p}_T^{\,\rm jet2}$. For historical
reasons [1-4, 7] we call this angle $\Delta S$.

• The second variable is the difference between the rapidity of the $V$-boson and the total rapidity of the two-jet
system:

$$\Delta\eta(V, {\rm jet12}) = |\eta^{V} - (\eta^{\rm jet1} + \eta^{\rm jet2})|. \qquad (4)$$

• The variable $\Delta\eta(V, {\rm jet12})$ can be calculated only for $V = Z$ events, but not for W, due to the missing $p_z$ information
of the $\nu$. Instead we can use the rapidity $\eta^{e}$ of the electron ($e$) from the W decay and introduce an analogous variable:

$$\Delta\eta(e, {\rm jet12}) = |\eta^{e} - (\eta^{\rm jet1} + \eta^{\rm jet2})|. \qquad (5)$$

• In the case of the W production the azimuthal angle between the electron from the W decay and the leading jet,
$\Delta\phi(e, {\rm jet1})$, can be considered.

Two other variables use angular differences between the first and second jets:

• the azimuthal angle between the jets, $\Delta\phi({\rm jet1}, {\rm jet2})$.

• the difference between the rapidities of the first and second jets, $\Delta\eta({\rm jet1}, {\rm jet2})$.

• Another variable characterizes the orientation of the two event planes: one contains the beam (proton) axis and the
$V$-boson, and the other contains the two jets [35]:

$$\cos\psi^{*}(V, {\rm jet12}) = \frac{(\vec{p}^{\,V} \times \vec{p}^{\,\rm proton}) \cdot (\vec{p}^{\,\rm jet1} \times \vec{p}^{\,\rm jet2})}{|\vec{p}^{\,V} \times \vec{p}^{\,\rm proton}| \, |\vec{p}^{\,\rm jet1} \times \vec{p}^{\,\rm jet2}|}. \qquad (6)$$

• In the case of the W production, we do not have the 3-vector of the W momentum but can use the electron 3-vector
instead, i.e. we calculate $\cos\psi^{*}(e, {\rm jet12})$.

Three other variables are based on the jet $p_T$:

• the total sum of the first and second jet $p_T$:

$$p_T^{\rm sum12} = p_T^{\rm jet1} + p_T^{\rm jet2}. \qquad (7)$$

• the relative difference between the first and second jet $p_T$:

$$p_T^{\rm dif12} = (p_T^{\rm jet1} - p_T^{\rm jet2}) / p_T^{\rm sum12}. \qquad (8)$$

• the total $p_T$ sum of all jets, $p_T^{\rm sumAll}$.

• Finally, we add the total number of all jets ($p_T > 6$ GeV), $N_{\rm jets}$.



All these 12 variables are shown in figures 7 and 8 for HW and DP W+dijet events. They demonstrate a good
separation power between the two event types.
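Several of the variables defined above can be computed from momentum components as in the sketch below. It is an illustration only: momenta are assumed to be given as Cartesian tuples (transverse vectors as `(px, py)`, 3-vectors as `(px, py, pz)`), and all function names are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation folded into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d


def delta_s(v_pt, jet1_pt, jet2_pt):
    """Eq. (3): angle between pT[V] and the vector sum of the two jet pT vectors."""
    jx, jy = jet1_pt[0] + jet2_pt[0], jet1_pt[1] + jet2_pt[1]
    return delta_phi(math.atan2(v_pt[1], v_pt[0]), math.atan2(jy, jx))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cos_psi_star(p_v, p_proton, p_jet1, p_jet2):
    """Eq. (6): cosine of the angle between the (V, beam) and (jet1, jet2) planes."""
    n1, n2 = _cross(p_v, p_proton), _cross(p_jet1, p_jet2)
    norm = math.sqrt(_dot(n1, n1)) * math.sqrt(_dot(n2, n2))
    if norm == 0.0:
        raise ValueError("a plane is undefined (collinear momenta)")
    return _dot(n1, n2) / norm


def pt_sum12(pt1, pt2):
    """Eq. (7)."""
    return pt1 + pt2


def pt_dif12(pt1, pt2):
    """Eq. (8)."""
    return (pt1 - pt2) / (pt1 + pt2)


# A boson recoiling exactly against the dijet system gives Delta S = pi
print(delta_s((10.0, 0.0), (-6.0, 1.0), (-4.0, -1.0)))
```

In a single hard scattering the boson balances the dijet system, so $\Delta S$ peaks near $\pi$, whereas in DP events the two jets balance each other and $\Delta S$ is expected to be much flatter.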



B. ANN 



The variables presented above can be used as input to a dedicated ANN to separate the HW from the DP events.
The variable $p_T^{\rm sumAll}$ is strongly correlated with $p_T^{\rm sum12}$, but the latter is a bit more sensitive to the signal/background
difference. We do not use the dijet mass information, to minimize dependence on a specific Higgs boson mass region,
but rather concentrate on other, more generic kinematic properties of the two event types.



We have chosen the following 9 variables to train the ANN: $\Delta S$, $\Delta\eta(e, {\rm jet12})$, $\Delta\phi(e, {\rm jet1})$, $\Delta\phi({\rm jet1}, {\rm jet2})$,
$\Delta\eta({\rm jet1}, {\rm jet2})$, $\cos\psi^{*}(e, {\rm jet12})$, $p_T^{\rm sum12}$, $p_T^{\rm dif12}$, and $N_{\rm jets}$, using the package JETNET [36]. The ANN is trained
using the signal HW (simulated with $m_H = 115$ GeV) and background DP events to produce a single output value
equal to zero for the background and unity for the signal events. The DP background events for the training (and
later for testing) purposes are selected around the Higgs boson $M_{jj}$ peak position, taking all events within $\pm 2\sigma$ around
the peak. We have trained the ANN using 200,000 signal and background events and then tested the ANN using
50,000 events that were not used at the training stage. The normalized distributions of the signal and background
events for the ANN output $O_{\rm NN}$ are presented in figure 9. The ANN weights obtained at the training stage have been
used later to also separate the HW signal simulated with $m_H = 150$ GeV from the DP events.

Tighter cuts on the ANN output will reject a larger fraction of the DP events. Figure 10 shows the correlation
between the efficiencies to select the background and signal events ($\epsilon_b^{\rm ANN}$ and $\epsilon_s^{\rm ANN}$, respectively) for the two Higgs boson
masses, $m_H = 115$ GeV and $m_H = 150$ GeV. Selecting 90% (80%) of the signal events with $m_H = 115$ GeV we will
keep only about 24% (13%) of the DP events, while selecting 90% (80%) of the signal events with $m_H = 150$ GeV we
will keep about 9% (4%) of the DP events.
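The efficiency correspondence described above can be obtained from any pair of ANN output samples by choosing the cut that keeps a target fraction of the signal and then counting the background above that cut. The sketch below assumes plain lists of output values in [0, 1]; the sample values and function names are hypothetical and are not JETNET outputs.

```python
import math


def cut_for_signal_efficiency(signal_outputs, target_eff):
    """Highest cut value that keeps at least target_eff of the signal (output >= cut)."""
    ordered = sorted(signal_outputs, reverse=True)
    # small guard against floating-point round-up in target_eff * N
    n_keep = max(1, math.ceil(target_eff * len(ordered) - 1e-9))
    return ordered[n_keep - 1]


def efficiency(outputs, cut):
    """Fraction of events with ANN output at or above the cut."""
    return sum(1 for o in outputs if o >= cut) / len(outputs)


# Hypothetical ANN outputs, for illustration only
signal = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95, 0.99]
background = [0.01, 0.05, 0.1, 0.1, 0.2, 0.3, 0.3, 0.45, 0.6, 0.9]

cut = cut_for_signal_efficiency(signal, 0.8)
print(cut, efficiency(signal, cut), efficiency(background, cut))
```

Scanning the target efficiency from 0 to 1 traces out a curve of the kind shown in figure 10.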

C. Results 

The built ANN is used to further suppress the DP background, which dominates the signal events even
after the $b$-tagging selections (figure 5). The new signal-to-background ratios are shown in the two plots of figure 11,
corresponding to the HW signal efficiencies $\epsilon_s^{\rm ANN} = 90\%$ and 80%. The ratios at $\epsilon_s^{\rm ANN} = 90\%$ for both
mass regions, 115 GeV and 150 GeV, are now close to 1.3-1.5. The ratio grows further with $\epsilon_s^{\rm ANN} = 80\%$, and
reaches about 2.2 at $M_{jj} \simeq 115$ GeV and about 2.7 at $M_{jj} \simeq 150$ GeV.
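As a rough cross-check of these numbers, the signal-to-background ratio after the ANN cut can be estimated by scaling the peak-bin ratio before the cut by $\epsilon_s^{\rm ANN}/\epsilon_b^{\rm ANN}$. This is a back-of-the-envelope sketch that uses only the approximate peak values quoted in the text (S/B ~ 0.35 and a suppression by a factor of about 7), not the binned calculation behind figure 11; the function name is an assumption of this example.

```python
def sb_after_ann(sb_before, eff_signal, eff_background):
    """Scale S/B by the ratio of signal to background selection efficiencies."""
    return sb_before * eff_signal / eff_background


# m_H = 115 GeV: S/B ~ 0.35 at the peak before the ANN cut
sb_115_at_90 = sb_after_ann(0.35, 0.90, 0.24)
sb_115_at_80 = sb_after_ann(0.35, 0.80, 0.13)

# m_H = 150 GeV: signal suppressed by about a factor of 7 before the ANN cut
sb_150_at_90 = sb_after_ann(1.0 / 7.0, 0.90, 0.09)
sb_150_at_80 = sb_after_ann(1.0 / 7.0, 0.80, 0.04)

print(round(sb_115_at_90, 2), round(sb_115_at_80, 2))
print(round(sb_150_at_90, 2), round(sb_150_at_80, 2))
```

The estimates of about 1.3 and 1.4 at 90% signal efficiency, and 2.2 and 2.9 at 80%, are in line with the quoted ranges, given that the inputs are rounded peak-bin values.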



VI. CONCLUSION 

In this study we have shown that the W+dijet events produced in DP scattering can compose a
quite sizable background to the associated HW production with $H \to b\bar{b}$ decay. Its fraction relative to
the traditional background from SP scattering with the W+2-jet final state is found to be 4-8% in the dijet mass
region $115 < M_{jj} < 150$ GeV. We suggest a set of angular and jet $p_T$ variables that are sensitive to the difference
between the HW and DP kinematics. The neural network built using these variables allows significant suppression
of the DP background. Provided that the overall systematic uncertainties in the Higgs searches in the HW
channel decrease with time, and since every percent of the background events matters, use of the suggested anti-DP
neural network should be very helpful.
Figure 7: Normalized distributions of the number of HW signal (full red line) and W+dijet background (dashed black line)
events over the kinematic variables of section V A (part 1).